Olympics

Olympic Rings

Intro

The Olympics are a worldwide celebration of the world’s best athletes that promotes peace and unity. For many athletes, there is no greater honor than winning a gold medal for their country. This achievement may be influenced by obvious variables, such as the physical traits of the athletes or the type of Olympics (Summer or Winter), and by variables that do not seem immediately related to sports. We suspect that a country’s total GDP and population are among these factors, and in this blog we explore whether they influence the number of medals won at the Olympics. Our aim is to deepen our understanding of what drives a country’s success at the Games and to share what we find, in the hope that it may point toward ways to level the playing field.

USA Medal Exploration

The medal time graph above shows the number of medals that the United States won in each year the Olympics were held, which lets us see how the United States has performed on the Olympic stage. The sharp swings up and down reflect the difference in medal counts between the Summer and Winter Olympics: the USA tends to win more medals at the Summer Olympics than at the Winter Olympics.

2016 Olympic Data Analysis

The graph above depicts the relationship between a country’s GDP and its medal count at the 2016 Olympic Games. A large cluster of countries has both a low GDP and a low medal count, indicating that many small countries were not large medal winners. In the graph, only two countries, the United States and China, surpass the 50-medal mark, which is consistent with their relatively high GDP. The size of each point corresponds to the country’s population, but population appears to be less predictive of medal count than GDP: there are large and small points at both ends of the y-axis (medal count).
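One way to make the GDP-versus-population comparison more concrete is to compute a correlation coefficient for each variable against medal count. The sketch below is written in Python rather than the R used for the original plots. The numbers are made-up placeholders for five hypothetical countries, not the 2016 data, and `pearson` is a helper written only for this illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Placeholder values for five hypothetical countries (NOT real 2016 figures).
gdp_trillions = [0.1, 0.5, 1.0, 5.0, 10.0]
population_millions = [50, 5, 200, 80, 300]
medals = [1, 4, 9, 40, 90]

r_gdp = pearson(gdp_trillions, medals)
r_pop = pearson(population_millions, medals)
print(f"GDP vs medals:        r = {r_gdp:.2f}")
print(f"population vs medals: r = {r_pop:.2f}")
```

With real data, a noticeably larger coefficient for GDP than for population would support the visual impression from the scatter plot, though a few very large countries can dominate a Pearson correlation.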

GDP and Population Country Maps (2010)

GDP

This map displays total GDP in 2010 for countries around the world. Lighter shades indicate a lower total GDP, while darker shades indicate a higher total GDP. China, the United States, and Japan are among the countries with the highest total GDP. Users can hover over the country of their choice to see its name and its total GDP.

Population

This map displays populations for countries around the world in 2010. Lighter shades indicate a relatively low population, while dark shades indicate a relatively high population. China and India are among the countries with the highest total population by a considerable margin. Users can hover over the country of their choice to identify the country’s name and its population.

Population and Medals won in 2016 Olympics

This map displays the medals won at the 2016 Olympics relative to each country’s 2016 population. Lighter shades indicate a low number of medals relative to population, while darker shades indicate a high number of medals relative to population. Grey marks the countries that did not win a medal at the 2016 Olympics. Georgia, Azerbaijan, and Denmark are among the countries with the most medals relative to their populations. Countries with high GDPs or large populations, such as the United States, China, Japan, and Brazil, all won medals. Users can hover over a country to view its name and its medals won relative to its population. The maps were created using ggplot2 in R.
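The idea of medals relative to population can be illustrated with a short calculation. The Python sketch below assumes the measure is a simple ratio, expressed as medals per one million inhabitants. The exact formula behind the map is not stated here, so that choice is an assumption, and the country names and figures are hypothetical placeholders rather than real 2016 data.

```python
def medals_per_million(medals, population):
    """Medals won per one million inhabitants; None if no medals were won."""
    if population <= 0:
        raise ValueError("population must be positive")
    if medals == 0:
        return None  # such countries are drawn in grey on the map
    return medals / population * 1_000_000

# Hypothetical countries with placeholder figures (NOT real 2016 data).
countries = {
    "Smallland": {"medals": 7, "population": 3_500_000},
    "Bigland": {"medals": 100, "population": 400_000_000},
    "Nomedalia": {"medals": 0, "population": 20_000_000},
}

rates = {
    name: medals_per_million(c["medals"], c["population"])
    for name, c in countries.items()
}

# Rank the medal-winning countries from highest to lowest rate.
ranked = sorted(
    (name for name, rate in rates.items() if rate is not None),
    key=lambda name: rates[name],
    reverse=True,
)
print(ranked)
```

A small country with a handful of medals can outrank a large country with many more, which is why less populous nations stand out on this map.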

Athlete Medal Analysis

Height (cm)

cluster  min   Q1  median   Q3  max    mean    sd      n  missing
1        136  163     167  170  179  166.01  6.08   9903        0
2        164  187     190  195  223  191.33  6.66   6679        0
3        162  176     180  183  203  179.39  4.73  13599        0

Weight (kg)

cluster  min   Q1  median   Q3  max    mean     sd      n  missing
1         28   55      59   64   90   58.80   6.84   9903        0
2         70   87      92   98  182   94.05  11.06   6679        0
3         54   70      75   79  104   74.68   6.29  13599        0
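The columns in the tables above are standard summary statistics. As a rough illustration of how such a row is computed, the Python sketch below summarizes a short list of placeholder heights (not taken from the dataset). `summarize` is a helper written only for this example, and the `inclusive` quantile method is chosen on the assumption that the tables used R's default quantile definition.

```python
import statistics

def summarize(values):
    """Return min, quartiles, max, mean, sample sd, and count for a list."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "Q1": q1,
        "median": median,
        "Q3": q3,
        "max": max(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample standard deviation
        "n": len(values),
    }

# Placeholder heights in cm for one hypothetical cluster.
heights_cm = [160, 165, 170, 175, 180]
summary = summarize(heights_cm)
for name, value in summary.items():
    print(f"{name}: {value}")
```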

US Athletes

Height (cm)

cluster  min   Q1  median   Q3  max    mean    sd      n
1        136  163     167  170  179  166.01  6.08   9903
2        164  187     190  195  223  191.33  6.66   6679
3        162  176     180  183  203  179.39  4.73  13599

Weight (kg)

cluster  min   Q1  median   Q3  max    mean     sd      n
1         28   55      59   64   90   58.80   6.84   9903
2         70   87      92   98  182   94.05  11.06   6679
3         54   70      75   79  104   74.68   6.29  13599

These bar graphs draw on a dataset containing every Olympic medal winner in the competition’s history (since data collection began). We used k-means clustering to group the athletes into three clusters by height and weight, then measured the total medal count for each cluster, separated by type of medal. The first bar graph displays all athletes, whereas the second displays only American athletes.
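For readers unfamiliar with k-means, the following Python sketch shows the general idea on a handful of made-up athletes: group points by height and weight, then tally medals per cluster and medal type. It is not the code behind the bar graphs, which were produced in R. The athlete records are hypothetical, height and weight are used unscaled, and the deterministic starting centroids are a simplification chosen for this example (R's `kmeans()` uses random starts).

```python
from collections import Counter

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm for 2-D points. Returns (centroids, labels)."""
    # Deterministic start for this sketch: evenly spaced points from the
    # height-sorted list, so cluster 0 ends up shortest and k-1 tallest.
    ordered = sorted(points)
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, labels

# Hypothetical medal winners: (height in cm, weight in kg, medal).
athletes = [
    (180, 75, "Gold"), (160, 52, "Silver"), (191, 94, "Bronze"),
    (166, 59, "Gold"), (176, 72, "Silver"), (196, 100, "Gold"),
    (183, 78, "Bronze"), (170, 62, "Bronze"), (188, 88, "Silver"),
]

points = [(h, w) for h, w, _ in athletes]
centroids, labels = kmeans(points, k=3)

# Count medals per (cluster, medal type), as the bar graphs do.
medal_counts = Counter(
    (label, medal) for label, (_, _, medal) in zip(labels, athletes)
)
for (label, medal), count in sorted(medal_counts.items()):
    print(f"cluster {label}: {medal} x{count}")
```

Cluster numbers carry no meaning of their own, so the cluster with the middle height and weight can have a different number from one run or dataset to the next.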

All Athletes Clustered

The goal here is to observe how height and weight may affect whether an athlete is successful at the Olympics. In the dataset containing all athletes, the 3rd cluster accounts for the most medals. This cluster is the middle group by height and weight (averaging about 179 cm and 75 kg), which suggests that athletes of middling build have won more medals than those in either the taller, heavier group or the shorter, lighter group. Likewise, the shortest and lightest group has won more medals than the tallest and heaviest group. Generally speaking, athletes in the middle ground in terms of height and weight have historically been more successful at the Olympics than others.

We can also look at the individual types of medals across these three clusters. The same trend generally holds for each type of medal, with athletes of middle height and weight winning the most gold, silver, and bronze medals, although the gap between the first two clusters is much smaller.

American Athletes Clustered

The trend observed in the previous bar graph holds here as well. For American athletes, the 1st cluster accounts for the most medals. As before, this is the middle group of athletes by height and weight.

Limiting the data to American athletes may lend further support to the initial observation that athletes in the middle ground of height and weight have historically been the most successful.

Limitations:

First, different sports call for different physical builds. For example, a weightlifter competing at the Olympics needs to be big, whereas a gymnast needs to be much lighter and leaner, and gymnasts are often much shorter as well. These physical traits give athletes competitive advantages in their own sports, so it would be misleading to assert that one particular height and weight leads to the most success. That may hold in aggregate, but once the sports are considered individually, these clusters give less definitive results.

Similarly, we cannot distinguish athletes who compete as individuals from those who compete as part of a team. A gymnast, for example, competes as an individual and relies only on themselves for success during the competition. A basketball player, on the other hand, relies on teammates as well. Athletes on a team may also fill particular physical roles that complement the builds of their teammates: a basketball team will have a point guard, who is typically shorter and lighter, alongside a center, who is typically taller and heavier. Being unable to distinguish between team athletes and individual athletes is therefore another limitation of our k-means clustering analysis.

Conclusion

We have chosen quite a broad topic. While we are interested in country-specific trends relating to the Olympics, such as our first graph of the medal count for the United States alone, we also dive into more universal trends among athletes and nations throughout the world, such as our graphs on GDP and on height and weight.

We take this broad approach in order to identify multiple factors that affect a nation’s or an individual athlete’s ability to succeed at the Olympics. At the national level, economic strength appears to play a major role. There may be many reasons for this, but one plausible explanation is that the more money a nation has, the more it can invest in resources for developing elite athletes. In our data, nations with higher GDPs have experienced more success at the Olympics. These economic factors, however, have very little bearing on an athlete’s natural physical traits. At the individual level, it may be true that athletes in the middle group by height and weight have been the most successful.

It would be valuable to identify which of these factors has the greater impact on success, but at this point it is not clear which factor is the most important. At the very least, a number of factors potentially affect success at the Olympics.

References

Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R, Grolemund G, Hayes A, Henry L, Hester J, Kuhn M, Pedersen TL, Miller E, Bache SM, Müller K, Ooms J, Robinson D, Seidel DP, Spinu V, Takahashi K, Vaughan D, Wilke C, Woo K, Yutani H (2019). “Welcome to the tidyverse.” Journal of Open Source Software, 4(43), 1686. doi:10.21105/joss.01686 https://doi.org/10.21105/joss.01686.

Vanderkam D, Allaire J, Owen J, Gromer D, Thieurmel B (2018). dygraphs: Interface to ‘Dygraphs’ Interactive Time Series Charting Library. R package version 1.1.1.6, https://CRAN.R-project.org/package=dygraphs.

H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.

Xie Y, Cheng J, Tan X (2023). DT: A Wrapper of the JavaScript Library ‘DataTables’. R package version 0.27, https://CRAN.R-project.org/package=DT.

Arnold J (2021). ggthemes: Extra Themes, Scales and Geoms for ‘ggplot2’. R package version 4.2.4, https://CRAN.R-project.org/package=ggthemes.

C. Sievert. Interactive Web-Based Data Visualization with R, plotly, and shiny. Chapman and Hall/CRC Florida, 2020.

Simon Garnier, Noam Ross, Robert Rudis, Antônio P. Camargo, Marco Sciaini, and Cédric Scherer (2021). Rvision - Colorblind-Friendly Color Maps for R. R package version 0.6.2.

R. Pruim, D. T. Kaplan and N. J. Horton. The mosaic Package: Helping Students to ‘Think with Data’ Using R (2017). The R Journal, 9(1):77-102.

Zhu H (2021). kableExtra: Construct Complex Table with ‘kable’ and Pipe Syntax. R package version 1.3.4, https://CRAN.R-project.org/package=kableExtra.

Heesoo, K. (2017), “120 years of Olympic history: athletes and results” (Version 2), Kaggle, available at https://www.kaggle.com/heesoo37/120-years-of-olympic-history-athletes-and-results.

Devakumar, K. P. (2021), “World population 1960-2018” (Version 6), Kaggle, available at https://www.kaggle.com/imdevskp/world-population-19602018.

Loong, Ho. (2021), “GDP of each country and region(1960-2020)” (Version 3), Kaggle, available at https://www.kaggle.com/holoong9291/gdp-of-all-countries19602020.